
Geocities Datasets

Sub-collection of Web Archive Datasets



Geocities Datasets
by Nick Ruest
Media type: data. Views: 83. Favorites: 2. Comments: 0.
Web archive derivatives of the GeoCities collection from the Internet Archive. The derivatives were created with the Archives Unleashed Toolkit. The geocities-aut-parquet-derivatives.xz file, once extracted, produces a directory for each derivative in the Apache Parquet format, which is a columnar storage format. Similarly, the geocities-aut-csv-derivatives.xz file, once extracted, produces a directory for each derivative in the CSV format. These derivatives are generally small enough to work with on your...
Topics: csv, parquet, gephi, apache spark
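
As a rough illustration of working with the CSV derivatives described above, the following minimal Python sketch reads every CSV file found in a single derivative directory. It assumes the archive has already been extracted, that a derivative directory holds one or more files ending in .csv, and that each file may start with a header row. The function name read_csv_derivative and the has_header parameter are hypothetical and not part of the dataset or of the Archives Unleashed Toolkit.

```python
import csv
from pathlib import Path


def read_csv_derivative(derivative_dir, has_header=True):
    """Read every CSV file in one derivative directory into a list of rows.

    Assumption: the directory contains one or more *.csv files.
    With has_header=True each row is a dict keyed by that file's header row;
    with has_header=False each row is a plain list of strings.
    """
    rows = []
    # Sort the file names so the files are read in a stable order.
    for part in sorted(Path(derivative_dir).glob("*.csv")):
        with part.open(newline="", encoding="utf-8") as handle:
            if has_header:
                rows.extend(csv.DictReader(handle))
            else:
                rows.extend(csv.reader(handle))
    return rows
```

Calling read_csv_derivative on the path of one extracted derivative directory returns all of its rows in memory, which should only be attempted for derivatives small enough to fit there. The Parquet derivatives would need a Parquet-capable library instead of the standard csv module.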